\section{Related Work}
Most previous work on authorship identification operates at the source-code level.
The pioneering work on authorship identification for binary code [2] treats the program as the unit of classification.
Its authors assume that each program is written by a single author, which is unrealistic for real-world software.
They define six types of program provenance features to abstract binary code, and our work builds on theirs.
The most significant difference is that we classify authorship at a finer granularity: we treat the function, rather than the program, as the unit of classification.

